Protein β-turn prediction using nearest-neighbor method
Abstract
Motivation: Following the success of statistical and machine learning techniques in protein secondary structure prediction, similar techniques have been applied to protein β-turn prediction. In this study, we predict protein β-turns using a k-nearest neighbor method combined with a filter that uses predicted protein secondary structure information. The traditional k-nearest neighbor approach to β-turn prediction is modified to account for the unbalanced natural occurrence of β-turns and non-β-turns.
Results: Our prediction scheme is tested on a set of 426 non-homologous protein sequences. The scheme consists of two stages: a k-nearest neighbor stage and a filtering stage. Variations of the k-nearest neighbor method are used to take the properties of β-turns into account. The filtering stage combines the β-turn/non-β-turn estimates from the k-nearest neighbor stage with predicted protein secondary structure information from PSI-PRED to produce a new β-turn/non-β-turn estimate. Our results are compared with the previously best known β-turn prediction method on the same dataset of 426 non-homologous protein sequences and are shown to give slightly superior performance at significantly lower computational complexity.
Availability: Contact the author for information on the source code of the programs used.
Contact: [email protected]
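The two-stage scheme described in the abstract can be illustrated with a small sketch. The abstract does not specify the features, the distance measure, the decision threshold, or the filtering rule, so everything below is an assumption made for illustration only: residue windows compared by Hamming distance, a vote threshold below 0.5 as one simple way to compensate for β-turns being the minority class, and a hypothetical filter that keeps a turn call only where the predicted secondary structure is coil ("C"). All function names and the toy training data are invented for this example and are not the authors' implementation.

```python
# Toy training set of (residue window, label) pairs; label 1 = beta-turn.
# The windows and labels are made up for illustration.
TRAINING = [("APGNA", 1), ("LLLLL", 0), ("VVVVV", 0), ("KKKKK", 0)]


def hamming(a, b):
    """Count mismatched positions between two equal-length windows."""
    return sum(x != y for x, y in zip(a, b))


def knn_turn_score(window, training, k=3):
    """Return the fraction of the k nearest training windows labelled as turn."""
    nearest = sorted(training, key=lambda item: hamming(window, item[0]))[:k]
    return sum(label for _, label in nearest) / len(nearest)


def predict_turns(sequence, training, k=3, threshold=0.3, half_width=2):
    """Stage 1: slide a window over the sequence and threshold the kNN score.

    A threshold below 0.5 (an assumption here) lets a minority of turn
    neighbors trigger a turn call, which offsets the class imbalance.
    """
    pad = "X" * half_width  # "X" pads the sequence ends
    padded = pad + sequence + pad
    size = 2 * half_width + 1
    return [
        knn_turn_score(padded[i:i + size], training, k) >= threshold
        for i in range(len(sequence))
    ]


def filter_with_secondary_structure(turns, secondary_structure):
    """Stage 2 (hypothetical rule): keep a turn call only at predicted coil."""
    return [t and s == "C" for t, s in zip(turns, secondary_structure)]
```

In use, the output of predict_turns for a sequence would be passed to filter_with_secondary_structure together with a per-residue string of predicted states (for example from PSI-PRED) of the same length.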
Similar resources
Liquid-liquid equilibrium data prediction using large margin nearest neighbor
Guanidine hydrochloride has been widely used in the initial recovery steps of active protein from the inclusion bodies in aqueous two-phase system (ATPS). The knowledge of the guanidine hydrochloride effects on the liquid-liquid equilibrium (LLE) phase diagram behavior is still inadequate and no comprehensive theory exists for the prediction of the experimental trends. Therefore the effect the ...
Diabetes Prediction by Optimizing the Nearest Neighbor Algorithm Using Genetic Algorithm
Introduction: Diabetes or diabetes mellitus is a metabolic disorder in body when the body does not produce insulin, and produced insulin cannot function normally. The presence of various signs and symptoms of this disease makes it difficult for doctors to diagnose. Data mining allows analysis of patients’ clinical data for medical decision making. The aim of this study was to provide a model fo...
Drought Monitoring and Prediction using K-Nearest Neighbor Algorithm
Drought is a climate phenomenon which might occur in any climate condition and all regions on the earth. Effective drought management depends on the application of appropriate drought indices. Drought indices are variables which are used to detect and characterize drought conditions. In this study, it was tried to predict drought occurrence, based on the standard precipitation index (SPI), usin...
Prediction of protein solvent accessibility using fuzzy k-nearest neighbor method
MOTIVATION The solvent accessibility of amino acid residues plays an important role in tertiary structure prediction, especially in the absence of significant sequence similarity of a query protein to those with known structures. The prediction of solvent accessibility is less accurate than secondary structure prediction in spite of improvements in recent researches. The k-nearest neighbor meth...
Journal: Bioinformatics
Volume: 20
Publication date: 2004